A STUDY OF THE RELIABILITY OF TEST QUESTIONS.

By C. E. Rogers. 

This paper is in the nature of a supplement to "A Study of
the Reliability of Test Questions," by Professor George Gailey
Chambers, in the March number of The Mathematics 
Teacher. It includes results from five groups of students 
secured by the use of the same questions upon which Professor 
Chambers's study was based, except in the case of one group, 
where I added two questions to the original list for a purpose 
which I shall point out later. 

For easy reference I insert the questions as originally pre- 
pared by Professor Chambers. 

I. Do you discover any defects in the following reasoning, and if so,
explain why it is defective. 

The sidewalk was wet this morning. Therefore it must have rained 
last night. 

II. If all the inhabitants of the Rahib Islands have blue tattoo marks 
on their bodies, then which of the following statements would neces- 
sarily be true, which could not be true, and which might possibly be 
true? 

1. All people who have blue tattoo marks on their bodies are inhabitants
of the Rahib Islands.

2. Some inhabitants of the Rahib Islands do not have blue tattoo marks 
on their bodies. 

3. No people with blue tattoo marks on their bodies live anywhere 
except on the Rahib Islands. 

4. Some of the inhabitants of the Rahib Islands have blue tattoo 
marks upon their bodies. 

III. A certain club wishes to select the evening for its regular weekly 
meeting which would be most satisfactory to its members. Accordingly 
the secretary wrote to each member, asking what evening would be most 
satisfactory. 

Can you suggest another question which would have been better for 
the secretary to ask? 

IV. If a photographic plate be exposed to X-rays and then developed, 
black marks will be found upon it. 

1. If upon developing a photographic plate you should find black 
marks upon it, what would you conclude? 
2. Also if you should not find black marks upon it, what would you
conclude?

V. If John agrees to join the football team provided Charles joins it, 
but Charles decided not to join it, what follows about John? If John 
joins, but Charles does not join, is John breaking his agreement? 

The coefficients of correlation of the test results with the 
plane geometry-class records are computed for four of the 
groups. The other group was composed of teachers of second- 
ary mathematics, and I have included the results on the test 
for the purpose of making a comparison of the questions with 
younger and with more mature students. 

The formulae used in this paper are as follows:

I. $r = 2 \sin\left(\frac{\pi}{6}\,\rho\right)$, where $\rho = 1 - \frac{6 \sum D^2}{n(n^2 - 1)}$

II. $\text{P.E.} = 0.706 \, \frac{1 - r^2}{\sqrt{n}}$


In these formulae, r is the coefficient of correlation between 
the results of the test and the average of the class marks in 
plane geometry; D is the difference in ranks of the individual 
students in the test and in their plane geometry classes ; n is the 
number of students in the group; and P.E. is the probable 
divergence of the true from the obtained coefficient of cor- 
relation. 
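
The computation described above can be sketched in a few lines of Python. This is only an illustration, not the procedure actually followed by hand: it reads Formula I as the rank-difference coefficient ρ = 1 - 6ΣD²/(n(n² - 1)) converted by r = 2 sin(πρ/6), and Formula II as P.E. = 0.706(1 - r²)/√n. The function names, the sample scores, and the averaging of tied ranks are assumptions introduced for the example; the paper does not say how ties were ranked.

```python
import math


def ranks(values):
    """Rank values from highest (rank 1) downward.

    Tied values share the average of the positions they occupy
    (an assumption; the paper does not state its tie rule).
    """
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # average of positions i+1 .. j+1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def rank_correlation(test_scores, class_marks):
    """Formula I: r = 2 sin(pi * rho / 6), rho from rank differences D."""
    n = len(test_scores)
    d_squared = sum(
        (a - b) ** 2 for a, b in zip(ranks(test_scores), ranks(class_marks))
    )
    rho = 1 - 6 * d_squared / (n * (n * n - 1))
    return 2 * math.sin(math.pi * rho / 6)


def probable_error(r, n):
    """Formula II: P.E. = 0.706 (1 - r^2) / sqrt(n)."""
    return 0.706 * (1 - r * r) / math.sqrt(n)


# Hypothetical data for six students (not taken from any group in the paper).
example_test = [7, 5, 6, 3, 4, 2]
example_marks = [90, 85, 70, 75, 60, 65]
r = rank_correlation(example_test, example_marks)
print(round(r, 2), round(probable_error(r, len(example_test)), 2))
```

For the six hypothetical students the squared rank differences sum to 14, so ρ = .60, r comes to about .62, and the probable error to about .18. With so small a group the probable error is large relative to r, which is why the size of each group matters in judging the coefficients reported below.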

The first group was composed of 29 girls in the Horace Mann 
School, Teachers College, New York. They were for the most 
part third-year high-school students, but their classification was 
by no means uniform as far as their mathematics was concerned, 
as the following analysis will show: 

10 had studied plane geometry 1 year; 
13 had studied plane geometry i year; 

2 had studied plane geometry i year; 

2 had studied plane geometry J year, and dropped it; 

2 had studied plane geometry in other schools, and were credited. 

In examining this group question II. — 4 was inadvertently 
omitted from the list. Judging from the fact that this question 
proved for the other groups to be about average in difficulty, 
its omission was probably not material as far as the results are 
concerned. 
The following table shows the percentage of the class scoring
on each question:

TABLE I.

Question.    Percentage.
I                67
II.-1            42
II.-2            73
II.-3            55
III              17
IV.-1             7
IV.-2            83
V                31

Table II. shows the percentage scoring the corresponding 
number of points: 

TABLE II.

Points.      Percentage.
8                 0
7                 7
6                 7
5                10
4                34
3                21
2                14
1                 3½
0                 3½

The correlation coefficient between these results and plane 
geometry marks, computed by Formula I., above, is r = .25,
with a probable error of .12. 

The low correlation and the comparatively high unreliability
do not permit any reliable conclusion as to transfer. But as the
members of this group show so little uniformity in the time
spent on plane geometry, as well as in other features of
classification, the results are not surprising. As it is, the P.E. is a
little less than one half the coefficient of correlation.

However, as Professor Chambers has already pointed out, 
the questions were prepared not primarily to test the students 
but to determine the reliability of the questions as a measure of 
ability in general reasoning. 

Referring to Table I., it will be seen that Questions IV. — 1, 
III., and V. proved hardest for this group. 

The second group was made up of 38 girls in the fourth-year 
high-school grade of the Horace Mann School. Ten of these
had completed plane geometry during the school year 1912-3,
and the remaining twenty-eight during the school year 1913-4.
I gave the test near the close of the first quarter of the school 
year 1914-5. 

Table III. shows the percentage scoring on each question: 

TABLE III.

Question.    Percentage.
I                82
II.-1            57
II.-2            89
II.-3            55
II.-4            84
III              42
IV.-1            34
IV.-2            87
V                37

Table IV. shows the percentage scoring the corresponding 
number of points : 

TABLE IV.

Points.      Percentage.
9                 3
8                13
7                18
6                24
5                10
4                24
3                 2½
2                 5½
1                 0

The coefficient of correlation and its unreliability for this 
group are as follows: r = .75, and P.E. = .05.

I attribute this high correlation and low unreliability in part 
to the opportunity the teachers had for accurately estimating the 
class standing of the students, having them as they did through- 
out the entire course in plane geometry; and to the frequency 
of similar tests in the Horace Mann School, thus tending to 
afford a normal and accurate measure of the class. Of course, 
in addition to these causes, there was unquestionably a transfer 
in reasoning ability from some source. How much came
from geometry and how much from other subjects cannot be
determined, except in so far as the test questions serve as a
reliable measure of training given exclusively in geometry.

As in the first group, Table III. shows questions IV.-1, III.,
and V. to be the hardest for this group, though not in exactly
the same order.

The third group was composed of 84 girls in the third year of 
the Washington Irving High School, New York City. These 
girls began the study of plane geometry at the beginning of the 
second quarter of the school year 1913-4, and at the time of this 
test were at work in Book V. All of them had been together 
during their study of geometry, and probably longer, and in 
computing the correlation I used the average of the class marks 
for three quarters in plane geometry. 

Table V. shows the percentage on each question : 

TABLE V.

Question.    Percentage.
I                58
II.-1            65
II.-2            93
II.-3            66
II.-4            59
III              23
IV.-1            11
IV.-2            74
V                17

Table VI. shows the percentage scoring the corresponding
number of points:

TABLE VI.

Points.      Percentage.
7                 7
6                19
5                36
4                15
3                14½
2                 6
1                 2½

The coefficient of correlation and its unreliability are as follows:
r = .31, and P.E. = .07.
This group is probably a more nearly representative class for
the high schools of our larger cities than any of the other groups. 

Here again the results are similar to those of the two groups 
reported above with respect to the difficult questions. 

The same questions, with the omission of II. — 4, were given 
to a class of 24 young ladies and gentlemen in Teachers College. 
The title of the course is "Education 279," and the course is 
in the nature of a practicum in the teaching of mathematics. 

Table VII. shows the percentage scoring on each question : 

TABLE VII.

Question.    Percentage.
I                83
II.-1            62
II.-2            83
II.-3            67
III              67
IV.-1            83
IV.-2            83
V                71

Table VIII. shows the percentage scoring the corresponding 
number of points : 

TABLE VIII.

Points.      Percentage.
8                38
7                 8
6                25
5                17
4                 8
3                 4

Questions IV. — 1 and III., which were the most difficult for all 
the student groups, were not more difficult than the other ques- 
tions for this adult group. This seemed to confirm my opinion, 
formed from the results of the first test, that question IV. — 1 
involved technical terms so unfamiliar to the average high- 
school student as to largely vitiate its value to test his reasoning 
ability; and that question III. involved the consideration of a 
problem which, while provided for in geometry, would not yield 
readily and naturally except as a result of experience in dealing 
with similar situations. Accordingly, I prepared two questions
of the same type as IV.-1 and III. I did not substitute these,
but added them to the original list for the purpose of compar- 
ing results. I numbered these VI. and VII., and stated them 
as follows : 

VI. If a man falls from the top of a three-story building, bones of his
body will be broken. Now, if you should find a man with broken bones
lying on the sidewalk adjoining a three-story building, what would you
conclude?

VII. A teacher wishes to change the hour of recitation of a certain 
class. He requests each member of the class to write and hand in on 
paper the most convenient hour. Could the student have given any 
better information in assisting the teacher to find a suitable hour? If 
so, what? (The teacher is free to meet the class at any hour.) 

I gave the amended list to 38 young ladies and gentlemen in 
the East Tennessee State Normal School. These students 
would compare in classification approximately to the fourth-year
high-school grade, being classified for the most part as fourth-year
academic in our school. The average age was a little over
twenty. Practically all of them had completed plane geometry 
within a year. About one half of them had their geometry in 
our school, while the others were credited with it from other 
schools. 

Table IX. shows the percentage scoring on each question : 



TABLE IX.

Question.    Percentage.
I                82
II.-1            84
II.-2            89
II.-3            76
II.-4            89
III              29
IV.-1            50
IV.-2            79
V                42
VI               68
VII              55

Table X. shows the percentage scoring the corresponding 
number of points : 
TABLE X.

Points.      Percentage.
11                5
10                5
9                27
8                21
7                 8
6                14
5                15
4                 0
3                 5
2                 0
1                 0

For the eighteen who had plane geometry in our school I 
computed the correlation of the test results with the class marks, 
which was as follows: r = .52, with P.E. = .12.

Comparing Table IX. with the corresponding tables for the
other groups, I find that the questions which were most difficult
for the other groups were also most difficult for this group so
far as the original list is concerned. But it will be noted that
while only 50 per cent scored on IV.-1, 68 per cent scored on
VI., which is of the same type but involves more familiar
terms; and that while 29 per cent scored on III., 55 per cent
scored on VII., which is of the same type but involves a
situation more in keeping with the experience of the students.

Professor Chambers suggested in his paper that some of his 
original questions could be modified so as to better serve the 
purpose for which he intended them. In keeping with that sug- 
gestion I am offering questions VI. and VII. as modifications 
of questions IV. — 1 and III. 

All of the groups tested so far, including the one reported by 
Professor Chambers, found question V. much more difficult 
than the average of the list, exclusive of questions IV. — 1 and 
III. I believe, however, that this type of question should be 
included, as it involves an important principle in geometrical 
and general reasoning. Some of the students in "Education
279" and also in the State Normal School group made the point
that the phrase "provided Charles joins" implies that John
agrees to join "if and only if Charles joins." I do not find any
authority for such an interpretation, and I am inclined to the
opinion that the question could not be modified without largely
impairing its value. Furthermore, I do not think the low scoring
on this question is attributable to any appreciable extent to
the wording.

I might add that I have followed Professor Chambers' 
method of scoring and computing in securing the results re- 
ported in this paper. 

East Tennessee State Normal School, 
Johnson City, Tenn. 



